
    Optimal Exploration is no harder than Thompson Sampling

    Given a set of arms $\mathcal{Z}\subset \mathbb{R}^d$ and an unknown parameter vector $\theta_\ast\in\mathbb{R}^d$, the pure exploration linear bandit problem aims to return $\arg\max_{z\in \mathcal{Z}} z^{\top}\theta_{\ast}$ with high probability through noisy measurements of $x^{\top}\theta_{\ast}$ with $x\in \mathcal{X}\subset \mathbb{R}^d$. Existing (asymptotically) optimal methods require either a) potentially costly projections for each arm $z\in \mathcal{Z}$ or b) explicitly maintaining a subset of $\mathcal{Z}$ under consideration at each time. This complexity is at odds with the popular and simple Thompson Sampling algorithm for regret minimization, which just requires access to a posterior sampling and argmax oracle, and does not need to enumerate $\mathcal{Z}$ at any point. Unfortunately, Thompson sampling is known to be sub-optimal for pure exploration. In this work, we pose a natural question: is there an algorithm that can explore optimally and only needs the same computational primitives as Thompson Sampling? We answer the question in the affirmative. We provide an algorithm that leverages only sampling and argmax oracles and achieves an exponential convergence rate, with the exponent being the optimal among all possible allocations asymptotically. In addition, we show that our algorithm can be easily implemented and performs as well empirically as existing asymptotically optimal methods.
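
The two computational primitives named in the abstract above, a posterior sampling oracle and an argmax oracle, can be illustrated with standard linear Thompson Sampling for regret minimization. This is a minimal sketch of that baseline only, not the pure-exploration algorithm the paper proposes. The function name, the Gaussian prior $N(0, I)$, the Gaussian noise model, and the choice $\mathcal{X} = \mathcal{Z}$ are all assumptions made for illustration.

```python
import numpy as np


def linear_thompson_sampling(arms, theta_star, horizon, noise_sd=1.0, seed=0):
    """Linear Thompson Sampling (regret-minimization baseline, illustrative only).

    Assumptions: prior theta ~ N(0, I), Gaussian noise with std `noise_sd`,
    and the measurement set X equals the candidate set Z (both are `arms`).
    Returns the per-arm pull counts and the final posterior mean.
    """
    rng = np.random.default_rng(seed)
    n_arms, dim = arms.shape
    precision = np.eye(dim)      # posterior precision matrix (prior: identity)
    b = np.zeros(dim)            # running sum of x * y / noise_sd^2
    counts = np.zeros(n_arms, dtype=int)

    for _ in range(horizon):
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2  # symmetrize against floating-point drift
        mean = cov @ b
        # Posterior sampling oracle: draw one plausible parameter vector.
        theta_sample = rng.multivariate_normal(mean, cov)
        # Argmax oracle: best arm under the sampled parameter.
        i = int(np.argmax(arms @ theta_sample))
        x = arms[i]
        # Noisy measurement of x^T theta_star.
        y = x @ theta_star + noise_sd * rng.normal()
        # Conjugate Gaussian update of the posterior.
        precision += np.outer(x, x) / noise_sd**2
        b += x * y / noise_sd**2
        counts[i] += 1

    return counts, np.linalg.solve(precision, b)


if __name__ == "__main__":
    arms = np.eye(3)
    theta_star = np.array([1.0, 0.5, 0.0])
    counts, estimate = linear_thompson_sampling(arms, theta_star, horizon=500)
    print("pull counts:", counts)
    print("posterior mean:", estimate)
```

Note that the loop never enumerates the arm set beyond the single argmax call per round, which is the property the abstract contrasts with existing optimal exploration methods.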

    Minimax Optimal Submodular Optimization with Bandit Feedback

    We consider maximizing a monotonic, submodular set function $f: 2^{[n]} \rightarrow [0,1]$ under stochastic bandit feedback. Specifically, $f$ is unknown to the learner but at each time $t=1,\dots,T$ the learner chooses a set $S_t \subset [n]$ with $|S_t| \leq k$ and receives reward $f(S_t) + \eta_t$ where $\eta_t$ is mean-zero sub-Gaussian noise. The objective is to minimize the learner's regret over $T$ times with respect to $(1-e^{-1})$-approximation of maximum $f(S_*)$ with $|S_*| = k$, obtained through greedy maximization of $f$. To date, the best regret bound in the literature scales as $k n^{1/3} T^{2/3}$. And by trivially treating every set as a unique arm one deduces that $\sqrt{ {n \choose k} T }$ is also achievable. In this work, we establish the first minimax lower bound for this setting that scales like $\mathcal{O}(\min_{i \le k}(in^{1/3}T^{2/3} + \sqrt{n^{k-i}T}))$. Moreover, we propose an algorithm that is capable of matching the lower bound regret.
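
The benchmark in the abstract above is the set obtained through greedy maximization of $f$, which attains the $(1-e^{-1})$-approximation for monotone submodular functions under a cardinality constraint. A minimal sketch of that greedy procedure follows. It assumes exact, noise-free evaluations of $f$, unlike the bandit setting of the paper, and the coverage function and all names here are hypothetical examples rather than anything taken from the paper.

```python
def greedy_maximize(f, ground_set, k):
    """Greedy maximization of a monotone submodular set function.

    Assumes exact (noise-free) access to f, unlike the bandit setting.
    Adds, k times, the item with the largest marginal gain.
    """
    selected = set()
    for _ in range(k):
        best_item, best_gain = None, float("-inf")
        # Sorted iteration makes tie-breaking deterministic.
        for item in sorted(ground_set - selected):
            gain = f(selected | {item}) - f(selected)
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # ground set exhausted
            break
        selected.add(best_item)
    return selected


def make_coverage(sets, universe_size):
    """Normalized coverage function, a standard monotone submodular example."""
    def f(chosen):
        covered = set()
        for i in chosen:
            covered |= sets[i]
        return len(covered) / universe_size  # value in [0, 1]
    return f


if __name__ == "__main__":
    # Hypothetical instance: four candidate sets over a universe of 6 elements.
    sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1}}
    f = make_coverage(sets, universe_size=6)
    chosen = greedy_maximize(f, set(sets), k=2)
    print(chosen, f(chosen))
```

On this small instance the first round ties between items 0 and 2, the sorted order picks 0, and the second round adds item 2, covering the whole universe.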

    STUDY OF POWER FILTER TOPOLOGIES AND CONTROL MECHANISM

    A power system has three natural (physical) characteristics, namely voltage, current and frequency. Deviations in these physical characteristics are termed power quality issues. The presence of nonlinear currents or nonlinear/unbalanced voltages and frequencies is likewise termed a power quality issue. These (current, voltage and frequency) deviations result in failure of, or damage to, equipment in the power system. Power electronic converters, with their nonlinear switching characteristics and high-frequency operation, are the major cause of power quality issues. To reduce harmonics and improve power quality, a Hybrid Active Power Filter (HAPF) or shunt HAPF can be employed. The improvement can be achieved using various approaches, such as the RLS algorithm, a DC link voltage controller, or a fuzzy logic based hybrid filter.

    Explicit computations of Hida families via overconvergent modular symbols

    In [Pollack-Stevens 2011], efficient algorithms are given to compute with overconvergent modular symbols. These algorithms then allow for the fast computation of $p$-adic $L$-functions and have further been applied to compute rational points on elliptic curves (e.g. [Darmon-Pollack 2006, Trifković 2006]). In this paper, we generalize these algorithms to the case of families of overconvergent modular symbols. As a consequence, we can compute $p$-adic families of Hecke-eigenvalues, two-variable $p$-adic $L$-functions, $L$-invariants, as well as the shape and structure of ordinary Hida-Hecke algebras. Comment: 51 pages. To appear in Research in Number Theory. This version has added some comments and clarifications, a new example, and further explanations of the previous example.